This is a method for reproducing in vitro the serine protease activity associated with the HCV NS3 protein, which comprises the use of both 
sequences contained in NS3 and sequences contained in NS4A. This method takes advantage of the ability of the HCV NS4A protein, 
or sequences contained therein, to act as a cofactor of the serine protease activity or more generally of the enzymatic activities associated 
with NS3. Optimal serine protease activity is obtained when NS4A is present in a molar ratio of at least 1:1 with NS3. NS3 and NS4A 
can also be incorporated in the reaction mixture as an NS3-NS4A precursor, as this precursor will generate, by means of an autoproteolytic 
event, equimolar amounts of NS3 and NS4A. It is also possible to mutate the cleavage site between NS3 and NS4A in a precursor, so 
that NS4A remains covalently bonded to NS3. The sequences that do not influence the proteolytic activity of NS3 can subsequently be 
removed from this non-proteolyzable precursor. The invention also relates to a composition of matter that comprises sequences contained 
in NS3 and NS4A, and to the use of these compositions for the setup of an enzymatic test capable of selecting, for therapeutic purposes, 
compounds that inhibit the enzymatic activity associated with NS3. The figure shows plasmid vectors used in the method to activate HCV 
NS3 protease in cultivated cells and in vitro. 

METHOD FOR REPRODUCING IN VITRO THE PROTEOLYTIC ACTIVITY 
OF THE NS3 PROTEASE OF HEPATITIS C VIRUS (HCV) 

DESCRIPTION 

The present invention has as its subject a method for 
reconstituting the serine protease activity associated 
with the HCV NS3 protein, which makes use of the ability 
of the HCV protein NS4A, or sequences contained therein, 
to act as a cofactor of the serine protease activity or 
more generally speaking of enzymatic activities 
associated with NS3. 

As is known, the hepatitis C virus (HCV) is the 
main etiological agent of non-A, non-B hepatitis (NANB). 
It is estimated that HCV causes at least 90% of post- 
transfusional NANB viral hepatitis and 50% of sporadic 
NANB hepatitis. Although great progress has been made 
in the selection of blood donors and in the 
immunological characterization of blood used for 
transfusions, there is still a high level of acute HCV 
infection among those receiving blood transfusions (one 
million or more infections every year throughout the 
world). Approximately 50% of HCV-infected individuals 
develop cirrhosis of the liver within a period that can 
range from 5 to 40 years. Furthermore, recent clinical 
studies suggest that there is a correlation between 
chronic HCV infection and the development of 
hepatocellular carcinoma. 

HCV is an enveloped virus containing a positive-strand 
RNA genome of approximately 9.4 kb. This virus is 
a member of the Flaviviridae family, the other members 
of which are the flaviviruses and the pestiviruses. The 
RNA genome of HCV has recently been mapped. Comparison 
of sequences from the HCV genomes isolated in various 
parts of the world has shown that these sequences can be 
extremely heterogeneous. The majority of the HCV genome 
is occupied by an open reading frame (ORF) that can vary 
between 9030 and 9099 nucleotides. This ORF codes for a 
single viral polyprotein, the length of which can vary 
from 3010 to 3033 amino acids. During the viral 
infection cycle, the polyprotein is proteolytically 
processed into the individual gene products necessary 
for replication of the virus. The genes coding for HCV 
structural proteins are located at the 5'-end of the 
ORF, whereas the region coding for the non-structural 
proteins occupies the rest of the ORF. 

The structural proteins consist of C (core, 21 
kDa), E1 (envelope, gp37) and E2 (NS1, gp61). C is a 
non-glycosylated protein of 21 kDa which probably forms 
the viral nucleocapsid. The protein E1 is a 
glycoprotein of approximately 37 kDa and it is believed 
to be a structural protein for the outer viral envelope. 
E2, another membrane glycoprotein of 61 kDa, is probably 
a second structural protein in the outer envelope of the 
virus. 

The non-structural region starts with NS2 (p24), a 
hydrophobic protein of 24 kDa whose function is unknown. 
NS3, a protein of 68 kDa which follows NS2 in the 
polyprotein, is predicted to have two functional 
domains: a serine protease domain in the first 200 
amino-terminal amino acids, and an RNA-dependent ATPase 
domain at the carboxy terminus. The gene region 
corresponding to NS4 codes for NS4A (p6) and NS4B (p26), 
two hydrophobic proteins of 6 and 26 kDa, respectively, 
whose functions have not yet been clarified. The gene 
corresponding to NS5 also codes for two proteins, NS5A 
(p56) and NS5B (p65), of 56 and 65 kDa, respectively. 
An amino acid sequence present in all the RNA-dependent 
RNA polymerases can be recognized within the NS5 
region. This suggests that the NS5 region contains 
parts of the viral replication machinery. 

Various molecular biological studies indicate that 
the signal peptidase, a protease associated with the 
endoplasmic reticulum of the host cell, is responsible 
for proteolytic processing in the structural region, 
that is to say at sites C/E1, E1/E2 and E2/NS2. The 
serine protease contained in NS3 is responsible for 
cleavage at the junctions between NS3 and NS4A, between 
NS4A and NS4B, between NS4B and NS5A and between NS5A 
and NS5B. In particular it has been found that the 
cleavage performed by this serine protease leaves a 
residue of cysteine or threonine on the amino-terminal 
side (position P1) and an alanine or serine residue on 
the carboxy-terminal side (position P1') of the scissile 
bond. A second protease activity of HCV appears to be 
responsible for the cleavage between NS2 and NS3. This 
protease activity is contained in a region comprising 
both part of NS2 and the part of NS3 containing the 
serine protease domain, but does not use the same 
catalytic mechanism. 
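
The P1/P1' rule stated above can be pictured with a short sketch. The Python function below is only an illustration and not part of the method described here: the function name and the one-letter input format are assumptions made for this example, and the code encodes nothing beyond the two-residue rule given in the text, so on a real polyprotein it would flag many positions that are not actually cleaved.

```python
# Illustrative sketch only: encodes the P1/P1' rule from the text
# (Cys or Thr at P1, Ala or Ser at P1') and nothing else.
P1_RESIDUES = frozenset("CT")        # residue on the amino-terminal side
P1_PRIME_RESIDUES = frozenset("AS")  # residue on the carboxy-terminal side


def candidate_cleavage_sites(sequence):
    """Return the 1-based position of P1 for every residue pair matching the rule.

    `sequence` is a protein sequence in one-letter code (a hypothetical
    input format chosen for this example).
    """
    seq = sequence.upper()
    return [
        i + 1
        for i in range(len(seq) - 1)
        if seq[i] in P1_RESIDUES and seq[i + 1] in P1_PRIME_RESIDUES
    ]


if __name__ == "__main__":
    # Made-up test string, not an HCV sequence.
    print(candidate_cleavage_sites("GGTSAAC"))
```

A scan of this kind can only propose candidates; whether a given bond is hydrolyzed has to be established experimentally, as in the examples that follow.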

In the light of the above description, the NS3 
protease is considered a potential target for the 
development of anti-HCV therapeutic agents. However, the 
search for such agents has been hampered by the evidence 
that the serine protease activity displayed by NS3 in 
vitro is too low to allow screening of inhibitors. 

It has now been unexpectedly found that this 
important limitation can be overcome by adopting the 
method according to the present invention, which also 
gives additional advantages that will be evident from 
the following. 

According to the present invention, the method to 
reproduce in vitro the proteolytic activity of the 
protease NS3 of HCV is characterized by the use, in the 
reaction mixture, of both sequences contained in NS3 and 
sequences contained in NS4A. 

Optimal serine protease activity is obtained when 
NS4A is present in a molar ratio of 1:1 with NS3. 
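
As a worked illustration of what a 1:1 molar ratio means in practice, the following Python sketch converts a mass of enzyme into the mass of cofactor needed for a chosen molar ratio. The function name and arguments are hypothetical, and the 70 kDa and 6 kDa figures in the usage line are simply the apparent sizes reported for the NS3 and NS4A products in Example 2, used here as illustrative inputs rather than exact molecular weights.

```python
def cofactor_mass_ug(enzyme_mass_ug, enzyme_mw_kda, cofactor_mw_kda, molar_ratio=1.0):
    """Mass of cofactor (in micrograms) giving `molar_ratio` mol cofactor per mol enzyme.

    Micrograms divided by kDa gives nanomoles, so the units cancel without
    any further conversion factor.
    """
    if enzyme_mw_kda <= 0 or cofactor_mw_kda <= 0:
        raise ValueError("molecular weights must be positive")
    enzyme_nmol = enzyme_mass_ug / enzyme_mw_kda
    return enzyme_nmol * molar_ratio * cofactor_mw_kda


if __name__ == "__main__":
    # Illustrative values only: 70 ug of a 70 kDa protein and a 6 kDa cofactor.
    print(cofactor_mass_ug(70.0, 70.0, 6.0))
```

Because the cofactor is much smaller than the enzyme, an equimolar amount corresponds to a much smaller mass, which is easy to overlook when preparing a mixture by weight.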

NS3 and NS4A can also be incorporated in the 
reaction mixture as an NS3-NS4A precursor, as this 
precursor will generate, by means of an autoproteolytic 
event, equimolar amounts of NS3 and NS4A. 

It is also possible to mutate the site of cleavage 
between NS3 and NS4A, in a precursor, so that NS4A 
remains covalently bound to NS3. The sequences that do 
not influence the proteolytic activity of NS3 can 
subsequently be removed from this non-proteolyzable 
precursor. 

The invention also extends to a new composition of 
matter, characterized in that it comprises proteins 
whose sequences are described in SEQ ID NO: 1 and SEQ ID 
NO: 2 or sequences contained therein or derived 
therefrom. It is understood that these sequences may 
vary in different HCV isolates, as all the RNA viruses 
show a high degree of variability. This new composition 
of matter has the proteolytic activity necessary to 
obtain the proteolytic maturation of several of the 
non-structural HCV proteins. 

The present invention also has as its subject the 
use of these compositions of matter in order to prepare 
an enzymatic assay capable of identifying, for 
therapeutic purposes, compounds that inhibit the 
enzymatic activity associated with NS3, including 
inhibitors of the interaction between NS3 and NS4A. 

Up to this point a general description has been 
given of the present invention. With the aid of the 
following examples, a more detailed description of 
specific embodiments thereof will now be given, in order 
to give a clearer understanding of its objects, 
characteristics, advantages and method of operation. 

The figure illustrates plasmid vectors used in the 
method to activate the HCV NS3 protease in cultivated 
cells and in vitro (Example 1 and Example 2). 

EXAMPLE 1 

METHOD OF ACTIVATION OF HCV NS3 SERINE PROTEASE IN 
CULTIVATED CELLS 

Plasmid vectors were constructed for expression of 
NS3, NS4A and other non-structural HCV proteins in HeLa 
cells. The plasmids constructed are schematically 
illustrated in figure 1. Selected fragments of the cDNA 
corresponding to the genome of the HCV BK isolate 
(HCV-BK) were cloned downstream of the promoter of the 
bacteriophage T7 in the plasmid vector pCite-1® 
(Novagen). This expression vector contains the internal 
ribosome entry site of the encephalomyocarditis virus, 
so as to guarantee an effective translation of the 
messenger RNA transcribed from promoter T7, even in the 
absence of a CAP structure. 

The various fragments of HCV-BK cDNA were cloned 
into the plasmid pCite-1® using methods known in 
molecular biology practice. pCite(NS3) contains the 
portion of the HCV-BK genome comprised between 
nucleotides 3351 and 5175 (amino acids 1007-1615 of the 
polyprotein). pCite(NS4B/5A) contains the portion of 
the HCV-BK genome comprised between the nucleotides 5652 
and 7467 (amino acids 1774-2380). pCite(NS3/4A) contains 
the portion of the HCV-BK genome comprised between the 
nucleotides 3711 and 5465 (amino acids 991-1711). 
pCite(NS4A) contains the portion of the HCV-BK genome 
comprised between the nucleotides 5281 and 5465 (amino 
acids 1649-1711). pCite(NS5AB) contains the portion of 
the HCV-BK genome comprised between the nucleotides 6224 
and 9400 (amino acids 1965-3010). The numbering given 
above agrees with the sequences for the genome and the 
polyprotein given for HCV-BK in Takamizawa et al., 
Structure and organization of the hepatitis C virus 
genome isolated from human carriers, (1991), J. Virol. 
65, 1105-1113. 

In order to obtain efficient expression of the 
various portions of the HCV polyprotein, the HeLa cells 
were infected with vTF7-3, a recombinant vaccinia virus 
which allows synthesis of the RNA polymerase of the 
bacteriophage T7 in the cytoplasm of infected cells. 
These cells, after infection, were then transfected with 
plasmid vectors selected from among those described in 
the figure. The HeLa cells thus infected and transfected 
were then metabolically labelled with [35S]methionine 
and the recombinant proteins encoded by the various 
plasmids could be identified by immunoprecipitation with 
polyclonal rabbit antibodies that recognize NS3, NS4 or 
NS5A. The method described in the present example for 
analysis of recombinant HCV proteins has already been 
described in L. Tomei et al., "NS3 is a serine protease 
required for processing of hepatitis C virus 
polyprotein", J. Virol. (1993) 67, 1017-1026 and in the 
bibliography mentioned therein. 

By transfecting the plasmid pCite(NS3) into the 
HeLa cells infected with vTF7-3, it is possible to 
observe the synthesis of a protein containing the 
catalytic domain of the HCV NS3 protease. pCite(NS4B5A) 
codes for a portion of the HCV polyprotein containing a 
peptide bond, at the junction between NS4B and NS5A, 
which would be expected to be hydrolyzed by the serine 
protease activity associated with NS3. However, when 
pCite(NS3) is cotransfected with pCite(NS4B5A), there is no 
evidence of proteolytic cleavage. Conversely, when the 
NS3 serine protease domain is expressed in combination 
with NS4A the proteolytic cleavage of the precursor 
encoded by pCite(NS4B5A) can take place normally. 
Coexpression of the NS3 serine protease domain and NS4A 
can be achieved, for example, by transfection with 
equimolar amounts of the plasmids pCite(NS3) and 
pCite(NS4A), by transfection of a plasmid coding for a 
precursor containing both NS3 and NS4A [pCite(NS34A)], 
or by transfection of a derivative of the latter plasmid 
from which all the sequences that are not relevant for 
proteolysis have been deleted 
[pCite(NS3Δint1237-1635)]. NS4A expressed transiently in 
HeLa cells can thus activate the proteolytic activity 
associated with NS3, which otherwise would not be seen. 

EXAMPLE 2 

METHOD OF ACTIVATION OF THE HCV NS3 SERINE PROTEASE IN 
AN IN VITRO TRANSLATION ASSAY 

The plasmids described in figure 1 can also be used 
for in vitro synthesis of mRNA coding for the respective 
HCV proteins using the purified RNA polymerase enzyme of 
the phage T7 (Promega). 

Generally the plasmids derived from pCite-1® were 
linearized using suitable restriction enzymes and 
transcribed using the protocols supplied by the 
manufacturer (Promega). These synthetic mRNAs could 
later be used to synthesize the corresponding proteins 
in extracts of rabbit reticulocytes in the presence of 
canine pancreas microsomal membranes. The reticulocyte 
extracts and the canine pancreas microsomal membranes, like 
all the other material required, were purchased from 
Promega, which also supplied the instructions for the in 
vitro protein synthesis process described above. 

By programming the in vitro translation mixture with 
mRNA transcribed from pCite(NS3), it is possible to 
observe synthesis of a protein with the expected 
molecular weight (68 kDa) containing the entire NS3 
serine protease domain. The mRNA transcribed from 
pCite(NS5AB) guides the synthesis of a precursor of 115 
kDa which contains NS5A and NS5B and is thus a substrate 
for the proteolytic activity associated with NS3. 

However, when the two proteins, containing the NS3 
serine protease domain and the substrate with the site 
corresponding to the junction between NS5A and NS5B, are 
synthesized in the same reaction mixture, there is no 
clear evidence of the proteolytic activity of NS3. 

On the contrary, the mRNA transcribed from 
pCite(NS34A) is translated into a precursor protein of 
approximately 76 kDa which self-processes 
proteolytically in vitro to give equimolar amounts of 
two proteins of 70 kDa and 6 kDa, containing NS3 and 
NS4A, respectively. 

If, in addition to the mRNA transcribed from 
pCite(NS34A), the mRNA transcribed from pCite(NS5AB) is 
included in the in vitro translation mixture, there can 
be observed, in addition to the self-proteolysis at the 
site between NS3 and NS4A, the generation of two new 
proteins of 56 kDa and 65 kDa which contain NS5A and 
NS5B, respectively. These proteins represent the 
product of proteolysis of the precursor containing NS5A 
and NS5B by NS3. Similarly, the 56 kDa and 65 kDa 
protein products, generated proteolytically from the 
NS5AB precursor, are obtained if the mRNA transcribed 
from pCite(NS3Δint1237-1635) is cotranslated with the 
mRNA transcribed from pCite(NS5AB). 

This result can be summarized by stating that, in 
vitro, the protease domain of NS3 alone is not capable 
of exhibiting protease activity on a substrate 
containing NS5A and NS5B. However, the serine protease 
activity of NS3 becomes evident if another protein 
sequence containing NS4A is present in addition to the 
NS3 protease domain. 
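
The summary above amounts to a simple rule: cleavage of the NS5AB substrate is seen only when both the NS3 protease domain and NS4A sequences are present. The Python sketch below records the three cotranslation outcomes reported in this example and checks that rule against them. The class, field and function names are hypothetical labels introduced for illustration; the data are only the qualitative results stated in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cotranslation:
    label: str                      # mRNAs present in the translation mixture
    has_ns3_protease_domain: bool
    has_ns4a: bool
    cleavage_observed: bool         # NS5A/NS5B products seen, as reported in Example 2


# Qualitative outcomes taken from the text of Example 2.
OBSERVATIONS = [
    Cotranslation("NS3 + NS5AB", True, False, False),
    Cotranslation("NS3/4A precursor + NS5AB", True, True, True),
    Cotranslation("internally deleted NS3/4A precursor + NS5AB", True, True, True),
]


def predicts_cleavage(has_ns3_protease_domain, has_ns4a):
    """Rule from the summary: both components are required for cleavage."""
    return has_ns3_protease_domain and has_ns4a


def rule_matches_observations(observations=OBSERVATIONS):
    """True if the rule reproduces every recorded outcome."""
    return all(
        predicts_cleavage(o.has_ns3_protease_domain, o.has_ns4a) == o.cleavage_observed
        for o in observations
    )


if __name__ == "__main__":
    print(rule_matches_observations())
```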

EXAMPLE 3 

METHOD OF ACTIVATION OF THE HCV PROTEASE USING A 
SYNTHETIC PEPTIDE CONTAINING NS4A SEQUENCES 

A synthetic peptide containing the sequence SEQ ID 
NO: 3 was synthesized on solid phase, according to 
processes known to those operating in this field. This 
sequence is derived from the C-terminal portion of SEQ 
ID NO: 2. In this peptide, the carboxy-terminal cysteine 
has been replaced with alpha-aminobutyric acid (Abu). 

This peptide was added to an in vitro translation 
mixture simultaneously programmed with the mRNAs 
transcribed from the plasmids pCite(NS3) and 
pCite(NS5AB). 

It was thus possible to observe the proteolytic 
activity associated with the serine protease domain of 
NS3, which resulted in the proteolytic cleavage of the 
substrate into the two products containing the proteins 
NS5A and NS5B. This activity is dependent on the 
simultaneous presence of the NS3 serine protease domain 
and the synthetic peptide with the sequence SEQ ID NO: 3. 

EXAMPLE 4 

METHOD OF ASSAY OF A RECOMBINANT HCV NS3 SERINE PROTEASE 
ON A PEPTIDE SUBSTRATE 

The plasmid pT7-7[NS3(1027-1206)], described in 
figure 1 and in Example 5, was constructed in order to 
allow expression in E. coli of the protein fragment 
comprised between amino acid 1 and amino acid 180 of 
SEQ ID NO: 1. This fragment contains the serine protease 
domain of NS3, as determined experimentally. The 
fragment of HCV cDNA coding for the NS3 fragment just 
described was cloned in the pT7-7 plasmid, an expression 
vector that contains the T7 RNA polymerase promoter φ10 
and the translation start site for the T7 gene 10 
protein (Studier and Moffatt, Use of bacteriophage T7 
RNA polymerase to direct selective high-level expression 
of cloned genes, (1986), J. Mol. Biol. 189, p. 113-130). 
The cDNA fragment coding for the NS3 serine protease 
domain as defined above was cloned downstream of the 
bacteriophage T7 promoter and in frame with the first 
ATG codon of the T7 gene 10 protein, using methods that 
are known in molecular biology practice. The pT7-7 
plasmid also contains the gene for the β-lactamase 
enzyme, which can be used as a marker of selection of E. 
coli cells transformed with plasmids derived from pT7-7. 

The plasmid pT7-7[NS3(1027-1206)] is then 
transformed into the E. coli strain BL21(DE3), which is 
normally employed for high-level expression of genes 
cloned into expression vectors containing the T7 promoter. 
In this strain of E. coli, the T7 polymerase gene is 
carried on the bacteriophage λDE3, which is integrated 
into the chromosome of BL21. Expression from the gene of 
interest is induced by addition of 
isopropylthiogalactoside (IPTG) to the growth medium 
according to a procedure that has been previously 
described (Studier and Moffatt, Use of bacteriophage T7 
RNA polymerase to direct selective high-level expression 
of cloned genes, (1986), J. Mol. Biol. 189, p. 113-130). 

The recombinant NS3 fragment containing the serine 
protease domain could be purified from E. coli 
BL21(DE3) transformed with the plasmid 
pT7-7[NS3(1027-1206)] by the procedure summarized below. 

In brief, E. coli BL21(DE3) cells harboring the 
pT7-7[NS3(1027-1206)] plasmid were grown at 37°C to an 
optical density at 600 nm of around 0.8 absorbance 
units. Thereafter, the medium was cooled down to 22°C 
and production of the desired protein induced by 
addition of IPTG to a final concentration of 0.4 mM. 
After 4-6 hours at 22°C in the presence of IPTG, cells 
were harvested and lysed by means of a French pressure 
cell in a buffer containing 20 mM sodium phosphate pH 
6.5, 0.5% (w/v) 3-[(3-cholamidopropyl)-dimethylammonio]- 
1-propanesulfonate (CHAPS), 50% (v/v) glycerol, 10 mM 
dithiothreitol and 1 mM EDTA (lysis buffer). The cell 
debris was removed by centrifugation (1 hour at 120000 x 
g) and the resulting pellet resuspended in lysis buffer, 
digested with DNAse I, re-homogenized and re-centrifuged 
as described above. S-Sepharose Fast Flow ion exchange 
resin (Pharmacia) pre-equilibrated in lysis buffer was 
added to the pooled supernatants (30% v/v) and the 
slurry was stirred for 1 hour at 4°C. The resin was 
sedimented and washed extensively with lysis buffer and 
poured into a chromatography column. The NS3 protease 
was eluted from the resin by applying a 0-1 M NaCl 
gradient. The protease-containing fractions were equilibrated 
with 50 mM sodium phosphate buffer pH 7.5, 10% (v/v) 
glycerol, 0.5% (w/v) CHAPS and 2 mM dithiothreitol. The 
protein was 90-95% pure after this step. Purification to 
>98% was achieved by subsequent chromatography on 
Heparin Sepharose equilibrated with 50 mM Tris pH 7.5, 
10% (v/v) glycerol, 0.5% (w/v) CHAPS and 2 mM 
dithiothreitol. Elution of the NS3 protease from this 
column was achieved by applying a linear 0-1 M NaCl 
gradient. 

WO 95/22985 PCT/IT95/00018 

The concentration of the purified protein was 
determined by the Bio-Rad protein assay (Bio-Rad cat. 
500-0006). 

The recombinant NS3 serine protease produced 
according to the above procedure in E. coli could be 
assayed for activity by cleaving a substrate that 
provides detectable cleavage products. The signal is 
preferably detectable by colorimetric or fluorometric 
means. Methods such as HPLC and the like are also 
suitable. 

For example, we used, as a substrate, synthetic 
peptides corresponding to the NS4A/4B junction of the 
HCV polyprotein and containing the amino acid sequence 
SEQ ID NO: 4 or part of it. 

Alternatively, peptide esters having the general 
structure indicated in SEQ ID NO: 5 can be used. 

The activity assay is performed by incubating 
5-1000 nM substrate and 0.05-1 µM protease in buffer 
containing 25 mM Tris/HCl pH 7.5, 3 mM dithiothreitol, 
0.5% (w/v) CHAPS and 10% (v/v) glycerol for 1-3 hours 
at 22°C. The reaction is stopped by addition of 
trifluoroacetic acid to yield a final concentration of 
0.1% (w/v). 
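
The buffer composition above can be turned into pipetting volumes with the usual dilution relation (stock concentration x stock volume = final concentration x final volume). The Python sketch below is a minimal illustration of that arithmetic. The stock concentrations are hypothetical values chosen for the example, as are the function and dictionary names; only the final concentrations come from the text.

```python
def stock_volumes(final_conc, stock_conc, total_volume_ul):
    """Volume of each stock (in microliters) needed for the given final concentrations.

    `final_conc` and `stock_conc` map component names to concentrations in
    the same unit per component. The leftover volume (for water, enzyme and
    substrate) is returned under the key "remaining volume".
    """
    volumes = {}
    for name, final in final_conc.items():
        stock = stock_conc[name]
        if stock <= 0 or final > stock:
            raise ValueError(f"stock for {name!r} is too dilute")
        volumes[name] = total_volume_ul * final / stock
    used = sum(volumes.values())
    if used > total_volume_ul:
        raise ValueError("stocks are too dilute to fit in the total volume")
    volumes["remaining volume"] = total_volume_ul - used
    return volumes


# Final concentrations as stated in the text.
ASSAY_BUFFER = {
    "Tris/HCl pH 7.5 (mM)": 25.0,
    "dithiothreitol (mM)": 3.0,
    "CHAPS (% w/v)": 0.5,
    "glycerol (% v/v)": 10.0,
}

# Hypothetical stock solutions, assumed only for this example.
HYPOTHETICAL_STOCKS = {
    "Tris/HCl pH 7.5 (mM)": 1000.0,
    "dithiothreitol (mM)": 1000.0,
    "CHAPS (% w/v)": 10.0,
    "glycerol (% v/v)": 50.0,
}

if __name__ == "__main__":
    for name, volume in stock_volumes(ASSAY_BUFFER, HYPOTHETICAL_STOCKS, 100.0).items():
        print(f"{name}: {volume:.1f} ul")
```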

The reaction products are then separated by HPLC on 
a C18 reverse phase column and quantitated according to 
their absorbance in the far UV. 

The proteolytic activity displayed by recombinant 
NS3 serine protease purified from E. coli is very low 
when the activity assay is performed as described above. 
However, we found that increasing amounts of the 
synthetic peptide described in SEQ ID NO: 3 stimulate the 
proteolytic activity of the recombinant NS3 serine 
protease up to 20-fold. Maximal activity is reached when 
the recombinant NS3 serine protease and the synthetic 
peptide are present in equimolar amounts. 

The assay described above can be used to search 
for protease inhibitors. Because the activity of 
NS3 protease in such an assay depends on the interaction of 
the NS3 serine protease domain with amino acid sequences 
derived from NS4A, it is also possible, by using the 
assay described above, to search for antagonists of the 
interaction between NS3 and NS4A that will ultimately 
inhibit the proteolytic activity associated with NS3. 
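
When the assay is used in this way, the readout for a test compound is typically expressed relative to an uninhibited control. The Python sketch below shows one common way to do that, assuming the amount of cleavage product is taken from integrated HPLC peak areas; the function name, its arguments and the optional background correction are assumptions made for illustration and are not specified in the text.

```python
def percent_inhibition(product_with_compound, product_without_compound, background=0.0):
    """Percent inhibition relative to an uninhibited control reaction.

    All three arguments are amounts of cleavage product in the same
    arbitrary unit (for example integrated HPLC peak areas). `background`
    is the signal of a reaction without enzyme (hypothetical control).
    """
    control = product_without_compound - background
    if control <= 0:
        raise ValueError("control signal must exceed background")
    treated = product_with_compound - background
    return 100.0 * (1.0 - treated / control)


if __name__ == "__main__":
    # Made-up peak areas, for illustration only.
    print(percent_inhibition(25.0, 100.0))
```

A compound that blocks the NS3-NS4A interaction and one that blocks the active site would both lower the product signal here, so distinguishing them would need additional experiments beyond this calculation.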

EXAMPLE 5 

DETAILED CONSTRUCTION OF THE PLASMIDS IN THE SOLE FIGURE 

pCite(NS3) contains the portion of the HCV-BK 
genome comprised between nucleotides 3351 and 5175 
(amino acids 1007-1615 of the polyprotein). 
Construction of this plasmid has been described in L. 
Tomei et al., "NS3 is a serine protease required for 
processing of hepatitis C virus polyprotein", J. Virol. 
(1993) 67, 1017-1026. 

pCite(NS4B/5A) was obtained by cloning a ScaI-BamHI 
fragment derived from the plasmid pCite(NS4-5), 
described in Tomei et al., into pCite(NS3) that was 
previously digested with MscI and BamHI. pCite(NS4B/5A) 
contains the portion of the HCV genome comprised between 
nucleotides 5652 and 7467 (amino acids 1774-2380 of the 
polyprotein). 

pCite(NS5AB) codes for a protein that comprises the 
sequence from amino acid 1965 to amino acid 3010 of the 
HCV-BK polyprotein. To construct this plasmid, the 
plasmid pCite(SX) described in Tomei et al. (1993), 
supra, was first digested with AseI and treated with 
the Klenow fragment of the DNA polymerase. After 
inactivation of the Klenow enzyme, the plasmid was 
digested with XbaI. The resulting cDNA fragment, 
containing the region between nucleotides 6224 and 9400, 
was purified and inserted into the BstXI and XbaI sites 
of the vector pCite-1®, after blunting the end generated 
by BstXI with T4 DNA polymerase. 

pCite(NS3/4A) was obtained as follows. A cDNA 
fragment, corresponding to the region between 
nucleotides 3711 and 5465 of the HCV-BK genome, was 
synthesized by means of polymerase chain reaction (PCR) 
using sequence-specific oligonucleotides as primers. A 
UAG stop codon was suitably included in the antisense 
oligonucleotide. After PCR amplification, the resulting 
cDNA was cleaved at the 5' end with SalI and the product 
of 750 base pairs cloned directionally into the SalI 
and NheI sites of the plasmid pCite(SX), after 
blunt-ending the NheI end with the Klenow fragment of the DNA 
polymerase. The resulting plasmid codes for the portion 
of HCV-BK polyprotein comprised between amino acids 991 
and 1711. 

For the construction of pCite(NS4A), a cDNA 
fragment, corresponding to the region between the 
nucleotides 5281 and 5465 of the HCV-BK genome (amino 
acids 1649-1711), was obtained by polymerase chain 
reaction (PCR) amplification with sequence-specific 
oligonucleotides as primers. The cDNA resulting from 
the PCR amplification was subsequently cloned into the 
BstXI and StuI sites of the plasmid pCite-1®, after 
blunt-ending the BstXI digested end with the DNA 
polymerase of the bacteriophage T4. 

pCite(NS3Δint1237-1635) is a derivative of 
pCite(NS3/4A) from which all the sequences comprised 
between nucleotide 4043 and nucleotide 5235 have been 
deleted. It was obtained by digesting pCite(NS3/4A) 
with BstEII and partially with ScaI. The fragment 
containing the deletion of interest was then 
circularised by use of T4 DNA ligase. This plasmid codes 
for a protein that has the same amino- and 
carboxy-terminal ends as that encoded by pCite(NS3/4A), but all 
the amino acid residues comprised between amino acid 
1237 and amino acid 1635, experimentally found to be 
dispensable for the NS3 serine protease activity, have 
been deleted. 

pT7-7[NS3(1027-1206)] contains the HCV sequence 
from nucleotide 3411 to nucleotide 3951, encoding the 
HCV NS3 fragment comprised between amino acid 1027 and 
amino acid 1206. In order to obtain this plasmid, a DNA 
fragment was generated by amplification of HCV cDNA by 
the polymerase chain reaction (PCR) using the 
oligonucleotides referred to as SEQ ID NO: 6 and SEQ ID NO: 7. 
The cDNA fragment obtained by PCR was phosphorylated, 
digested with NdeI and subsequently cloned downstream 
of the bacteriophage T7 promoter, immediately following 
the first ATG codon of the T7 gene 10 protein in the 
vector pT7-7 previously digested with NdeI and SmaI 
(Studier and Moffatt, Use of bacteriophage T7 RNA 
polymerase to direct selective high-level expression of 
cloned genes, (1986), J. Mol. Biol. 189, p. 113-130). It 
should be noted that an amber codon was inserted immediately 
following the HCV-derived sequence. 

SEQUENCE LISTING 

GENERAL INFORMATION 
(i) APPLICANT: ISTITUTO DI RICERCHE DI BIOLOGIA 
MOLECOLARE P. ANGELETTI S.p.A. 

(ii) TITLE OF INVENTION: METHOD FOR REPRODUCING IN 
VITRO THE PROTEOLYTIC ACTIVITY OF THE NS3 PROTEASE OF 
HEPATITIS C VIRUS (HCV) 

(iii) NUMBER OF SEQUENCES: 7 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Società Italiana Brevetti 

(B) STREET: Piazza di Pietra, 39 

(C) CITY: Rome 

(D) COUNTRY: Italy 

(E) POSTAL CODE: I-00186 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 3.5" 1.44 MBYTES 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS Rev. 5.0 

(D) SOFTWARE: Microsoft Wordstar 4.0 

(viii) ATTORNEY INFORMATION 

(A) NAME: DI CERBO, Mario (Dr.) 
(C) REFERENCE: RM/X88350/PCT-DC 

(ix) TELECOMMUNICATION INFORMATION 
(A) TELEPHONE: 06/6785941 

(B) TELEFAX: 06/6794692 

(C) TELEX: 612287 ROPAT 

(1) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS 

(A) LENGTH: 631 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: No 

(iv) ANTISENSE: No 

(v) FRAGMENT TYPE: internal fragment 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Hepatitis C Virus 
(C) ISOLATE : BK 

(vii) IMMEDIATE SOURCE: cDNA clone pCD (38-9.4) 
described by Tomei et al. in 1993 

(ix) FEATURE: 

(A) NAME: NS3 Serine Protease Domain 

(B) LOCATION: 1-180 

(C) IDENTIFICATION METHOD: Experimentally 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 

Ala Pro Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu Leu Gly Cys 
1 5 10 15 

Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu Gly Glu 
20 25 30 

Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala Thr Cys Val 
35 40 45 

Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser Lys Thr Leu 
50 55 60 

Ala Ala Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn Val Asp Gln 
65 70 75 80 

Asp Leu Val Gly Trp Pro Lys Pro Pro Gly Ala Arg Ser Leu Thr Pro 
85 90 95 

Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg His Ala Asp 
100 105 110 

Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser Leu Leu Ser 
115 120 125 

Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ser Gly Gly Pro Leu Leu 
130 135 140 

Cys Pro Phe Gly His Ala Val Gly Ile Phe Arg Ala Ala Val Cys Thr 
145 150 155 160 

Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu Ser Met Glu 
165 170 175 

Thr Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser Ser Pro Pro Ala 
180 185 190 

Val Pro Gln Ser Phe Gln Val Ala His Leu His Ala Pro Thr Gly Ser 
195 200 205 

Gly Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala Gln Gly Tyr Lys 
210 215 220 

Val Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly Phe Gly Ala 
225 230 235 240 

Tyr Met Ser Lys Ala His Gly Ile Asp Pro Asn Ile Arg Thr Gly Val 
245 250 255 

Arg Thr Ile Thr Thr Gly Ala Pro Val Thr Tyr Ser Thr Tyr Gly Lys 
260 265 270 

Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile Ile Ile 
275 280 285 

Cys Asp Glu Cys His Ser Thr Asp Ser Thr Thr Ile Leu Gly Ile Gly 
290 295 300 

Thr Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val Val Leu 
305 310 315 320 

Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His Pro Asn Ile 
325 330 335 

Glu Lgu Val Ala Leu Ser Asn Thr Gly Glu Ile Pro Phe Tyr Gly Lys 
340 345 350 

Ala Ile Pro Ile Glu Ala Ile Arg Gly Gly Arg His Leu Ile Phe Cys 
355 360 365 

His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Ser Gly Leu 
370 375 380 

Gly Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val Ile 
385 390 395 400 

Pro Thr Ile Gly Asp Val Val Val Val Ala Thr Asp Ala Leu Met Thr 
405 410 415 

Gly Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys Val 
420 425 430 

Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr 
435 440 445 

Thr Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg Arg Gly Arg 
450 455 460 

Thr Gly Arg Gly Arg Arg Gly Ile Tyr Arg Phe Val Thr Pro Gly Glu 
465 470 475 480 

Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp 
485 490 495 

Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr Ser Val Arg 
500 505 510 

Leu Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys Gln Asp His 
515 520 525 

Leu Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His Ile Asp Ala 

530 535 540 

His Phe Leu Ser Gln Thr Lys Gln Ala Gly Asp Asn Phe Pro Tyr Leu 
545 550 555 560 

Val Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro Pro Pro 

565 570 575 

Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro Thr Leu 

580 585 590 

His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln Asn Glu 

595 600 605 

Val Thr Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala Cys Met Ser 

610 615 620 

Ala Asp Leu Glu Val Val Thr 
625 630 

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS 

(A) LENGTH: 54 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: polypeptide 

(iii) HYPOTHETICAL: No 

(iv) ANTISENSE: No 

(v) FRAGMENT TYPE: Internal 

(vii) IMMEDIATE SOURCE: cDNA Clone (SEE SEQ ID NO: 1) 
(ix) FEATURE: 

(A) NAME: NS4A Protein 

(C) IDENTIFICATION METHOD: Experimentally 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 
Ser Thr Trp Val Leu Val Gly Gly Val Leu Ala Ala Leu Ala Ala Tyr 
1 5 10 15 

Cys Leu Thr Thr Gly Ser Val Val Ile Val Gly Arg Ile Ile Leu Ser 

20 25 30 

Gly Arg Pro Ala Ile Val Pro Asp Arg Glu Leu Leu Tyr Gln Glu Phe 
35 40 45 
Asp Glu Met Glu Glu Cys 
50 

(3) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS 

(A) LENGTH: 34 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: polypeptide 

(iii) HYPOTHETICAL: No 

(iv) ANTISENSE: No 

(v) FRAGMENT TYPE: internal 
(vii) IMMEDIATE SOURCE: 

(A) SYNTHESIS: Solid phase peptide synthesis 
(ix) FEATURE: 

(A) NAME: Cofactor of NS3 serine protease 
(C) IDENTIFICATION METHOD: experimentally 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 
Gly Ser Val Val Ile Val Gly Arg Ile Ile Leu Ser Gly Arg Pro Ala 
1 5 10 15 

Ile Val Pro Asp Arg Glu Val Leu Tyr Gln Glu Phe Asp Glu Met Glu 
20 25 30 

Glu Abu 
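
The two cofactor listings above can be compared residue by residue. The following is a minimal sketch in Python, not part of the original disclosure: it assumes that SEQ ID NO: 3 is to be aligned, without gaps, against the C-terminal 34 residues of SEQ ID NO: 2, and the function name `differences` is hypothetical. It checks the stated lengths (54 and 34 residues) and lists the positions at which the two listings differ, without any claim as to why they differ.

```python
# Three-letter residue lists transcribed from SEQ ID NO: 2 and SEQ ID NO: 3.
SEQ2 = """Ser Thr Trp Val Leu Val Gly Gly Val Leu Ala Ala Leu Ala Ala Tyr
Cys Leu Thr Thr Gly Ser Val Val Ile Val Gly Arg Ile Ile Leu Ser
Gly Arg Pro Ala Ile Val Pro Asp Arg Glu Leu Leu Tyr Gln Glu Phe
Asp Glu Met Glu Glu Cys""".split()

SEQ3 = """Gly Ser Val Val Ile Val Gly Arg Ile Ile Leu Ser Gly Arg Pro Ala
Ile Val Pro Asp Arg Glu Val Leu Tyr Gln Glu Phe Asp Glu Met Glu
Glu Abu""".split()


def differences(full, fragment):
    """Compare `fragment` with the C-terminal stretch of `full` of equal length.

    Assumption (not stated in the listing): an ungapped, end-anchored alignment.
    Returns (position in `full`, residue in `full`, residue in `fragment`)
    tuples, with positions counted from 1.
    """
    offset = len(full) - len(fragment)
    return [
        (offset + i + 1, a, b)
        for i, (a, b) in enumerate(zip(full[offset:], fragment))
        if a != b
    ]


if __name__ == "__main__":
    print(len(SEQ2), len(SEQ3))      # lengths stated in the listings: 54 and 34
    for pos, res2, res3 in differences(SEQ2, SEQ3):
        print(f"position {pos} of SEQ ID NO: 2: {res2} vs {res3} in SEQ ID NO: 3")
```

Under this assumed alignment, the listings as transcribed differ at two positions of SEQ ID NO: 2: position 43 (Leu versus Val) and position 54 (Cys versus Abu).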
CLAIMS 

1. A method for reproducing in vitro the proteolytic 
activity of the HCV NS3 protease, characterized in that 
both sequences contained in NS3 and sequences contained 
in NS4A are used in the reaction mixture. 

2. The method for reproducing in vitro the 
proteolytic activity of the HCV NS3 protease according to 
claim 1, in which NS4A is present in a ratio of 1:1 with 
respect to NS3. 

3. The method for reproducing in vitro the 
proteolytic activity of the HCV NS3 protease according to 
claim 1 or 2, in which NS3 and NS4A are incorporated in 
the reaction mixture as NS3-NS4A precursor, said 
precursor generating, by means of an autoproteolytic 
event, equimolar amounts of NS3 and NS4A. 

4. The method for reproducing in vitro the 
proteolytic activity of the HCV NS3 protease according to 
any one of the preceding claims, in which the cleavage 
site between NS3 and NS4A is mutated in a precursor, so 
that NS4A remains covalently bonded to NS3, it being 
subsequently possible to remove from said 
non-proteolyzable precursor the sequences that do not 
influence the proteolytic activity of NS3. 

5. A composition of matter, characterized in that it 
contains NS3 and NS4A sequences according to claims 1 to 
4. 

6. The composition of matter according to claim 5, 
comprising the proteins whose sequences are described in 
SEQ ID NO: 1 and SEQ ID NO: 2 or sequences contained 
therein or derived therefrom. 

7. Use of the compositions of matter according to 
claims 5 and 6 to set up an enzymatic test capable of 
selecting, for therapeutic purposes, compounds that 
inhibit the enzymatic activity associated with NS3. 

8. Method for reproducing in vitro the proteolytic 
activity of the HCV NS3 protease, compositions of matter 
and use of said compositions of matter to set up an 
enzymatic test capable of selecting, for therapeutic 
purposes, compounds that inhibit the enzymatic activity 
associated with NS3, according to the above description, 
examples and claims. 